A New Framework for Domain Adaptation without Model Retraining
نویسندگان
چکیده
We propose a principled and effective domain adaptation framework that pursues the goal of Open Domain NLP (train once, test anywhere). Most domain adaptation frameworks adapt the models trained on the source domain data by retraining it on target domains (with a mix of labeled and unlabeled data). However, it is time consuming to retrain big models or pipeline systems, and may not even be feasible if you consider a streaming data that may not be coherent (e.g., web data). We propose an adaptation framework that does not require retraining the original model. Instead, our approach adapts the target domain input so that it is more similar to the source domain, while preserving the labeling, thus increasing the accuracy of the original model when evaluated on target data. Our experiments on the named entity recognition task in scientific domains show an absolute F1 improvement of 13% over a state-of-the-art named entity recognizer. We also show that without any retraining, the proposed method outperforms the bootstrapping based adaptation method of (Jiang and Zhai, 2007b) that requires multiple rounds of retraining on the target domain data.
منابع مشابه
Domain Attention with an Ensemble of Experts
An important problem in domain adaptation is to quickly generalize to a new domain with limited supervision givenK existing domains. One approach is to retrain a global model across all K + 1 domains using standard techniques, for instance Daumé III (2009). However, it is desirable to adapt without having to reestimate a global model from scratch each time a new domain with potentially new inte...
متن کاملDictionary-based Domain Adaptation of MT Systems without Retraining
We describe our submission to the ITdomain translation task of WMT 2016. We perform domain adaptation with dictionary data on already trained MT systems with no further retraining. We apply our approach to two conceptually different systems developed within the QTLeap project: TectoMT and Moses, as well as Chimera, their combination. In all settings, our method improves the translation quality....
متن کاملSample-oriented Domain Adaptation for Image Classification
Image processing is a method to perform some operations on an image, in order to get an enhanced image or to extract some useful information from it. The conventional image processing algorithms cannot perform well in scenarios where the training images (source domain) that are used to learn the model have a different distribution with test images (target domain). Also, many real world applicat...
متن کاملDeveloping A New Operation-Economic Framework for Irrigation Networks without Water Market
This study focused on proposing a new operational perspective within main and lateral irrigation canals based on the economic value of water. To achieve this objective, the operation-economic framework offered in this study consisted of two main components of the PMP model and Operation model. The estimated economic values of water in different regions of the network were employed as the starti...
متن کاملAdapting Text instead of the Model: An Open Domain Approach
Natural language systems trained on labeled data from one domain do not perform well on other domains. Most adaptation algorithms proposed in the literature train a new model for the new domain using unlabeled data. However, it is time consuming to retrain big models or pipeline systems. Moreover, the domain of a new target sentence may not be known, and one may not have significant amount of u...
متن کامل